Overview

Dataset statistics

Number of variables: 15
Number of observations: 32561
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 24
Duplicate rows (%): 0.1%
Total size in memory: 3.7 MiB
Average record size in memory: 120.0 B
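
The memory figures above are consistent with one another if every column is stored as an 8-byte value. That storage width is an assumption (it is not stated in the report), and index overhead is ignored. A minimal sketch of the arithmetic:

```python
# Values copied from the dataset statistics above.
n_rows = 32561
n_columns = 15

# Assumption: each value occupies 8 bytes (for example int64 or float64).
bytes_per_value = 8

record_bytes = n_columns * bytes_per_value        # bytes per row
column_kib = n_rows * bytes_per_value / 1024      # KiB per column
total_mib = n_rows * record_bytes / 1024 ** 2     # MiB for the whole table

print(f"{record_bytes} B per record")
print(f"{column_kib:.1f} KiB per column")
print(f"{total_mib:.1f} MiB in total")
```

Under that assumption the per-column result matches the "Memory size" reported for each variable.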

Variable types

NUM: 13
BOOL: 2

Warnings

[Duplicates] Dataset has 24 (0.1%) duplicate rows
[Zeros] workclass has 1298 (4.0%) zeros
[Zeros] education has 5355 (16.4%) zeros
[Zeros] marital-status has 10683 (32.8%) zeros
[Zeros] occupation has 3770 (11.6%) zeros
[Zeros] relationship has 8305 (25.5%) zeros
[Zeros] race has 27816 (85.4%) zeros
[Zeros] capital-gain has 29849 (91.7%) zeros
[Zeros] capital-loss has 31042 (95.3%) zeros
[Zeros] native-country has 29170 (89.6%) zeros
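
The percentages in these warnings are the raw counts divided by the number of observations, rounded to one decimal place. The sketch below recomputes a few of them; the dictionary keys are only illustrative labels. Columns such as workclass or race have a handful of integer values, so a zero there is presumably just one category code rather than a true numeric zero, but that reading is an assumption.

```python
# Counts copied from the warnings above.
n_rows = 32561
counts = {
    "duplicate rows": 24,
    "workclass zeros": 1298,
    "race zeros": 27816,
    "capital-loss zeros": 31042,
}

# Share of all rows, in percent, rounded to one decimal place.
shares = {name: round(100 * count / n_rows, 1) for name, count in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```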

Reproduction

Analysis started: 2021-02-01 08:43:05.004877
Analysis finished: 2021-02-01 08:43:46.937055
Duration: 41.93 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
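
The reported duration is the difference between the two timestamps. A small check using only the standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-02-01 08:43:05.004877", fmt)
finished = datetime.strptime("2021-02-01 08:43:46.937055", fmt)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```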

Variables

age
Real number (ℝ≥0)

Distinct: 73
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.58164676
Minimum: 17
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 17
5-th percentile: 19
Q1: 28
Median: 37
Q3: 48
95-th percentile: 63
Maximum: 90
Range: 73
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.64043255
Coefficient of variation (CV): 0.3535471837
Kurtosis: -0.1661274596
Mean: 38.58164676
Median Absolute Deviation (MAD): 10
Skewness: 0.5587433694
Sum: 1256257
Variance: 186.0614002
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value              Count  Frequency (%)
36                 898    2.8%
31                 888    2.7%
34                 886    2.7%
23                 877    2.7%
35                 876    2.7%
33                 875    2.7%
28                 867    2.7%
30                 861    2.6%
37                 858    2.6%
25                 841    2.6%
Other values (63)  23834  73.2%

Minimum 5 values

Value  Count  Frequency (%)
17     395    1.2%
18     550    1.7%
19     712    2.2%
20     753    2.3%
21     720    2.2%

Maximum 5 values

Value  Count  Frequency (%)
90     43     0.1%
88     3      < 0.1%
87     1      < 0.1%
86     1      < 0.1%
85     3      < 0.1%
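
Several of the summary values for age are simple functions of the others. The sketch below recomputes them from the reported figures only (the raw data is not needed). The bin width assumes the 50 histogram bins split the range from minimum to maximum evenly, which the report does not state explicitly.

```python
# Summary values for age, copied from the tables above.
minimum, q1, q3, maximum = 17, 28, 48, 90
mean = 38.58164676
std = 13.64043255
n_rows = 32561
n_bins = 50

value_range = maximum - minimum      # Range
iqr = q3 - q1                        # Interquartile range
cv = std / mean                      # Coefficient of variation
variance = std ** 2                  # Variance
total = mean * n_rows                # Sum
bin_width = value_range / n_bins     # Assumed width of one histogram bin

print(value_range, iqr)
print(f"CV = {cv:.10f}, variance = {variance:.7f}")
print(f"sum = {total:.0f}, bin width = {bin_width}")
```

The same relationships hold for the other numeric variables in this report.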
 

workclass
Real number (ℝ≥0)

ZEROS

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.309972052
Minimum: 0
Maximum: 8
Zeros: 1298
Zeros (%): 4.0%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 2
Q3: 2
95-th percentile: 5
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.225728135
Coefficient of variation (CV): 0.5306246602
Kurtosis: 2.158845812
Mean: 2.309972052
Median Absolute Deviation (MAD): 0
Skewness: 1.377272436
Sum: 75215
Variance: 1.502409462
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=9)]
Common values

Value  Count  Frequency (%)
2      22696  69.7%
1      2541   7.8%
4      2093   6.4%
5      1836   5.6%
0      1298   4.0%
6      1116   3.4%
3      960    2.9%
7      14     < 0.1%
8      7      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      1298   4.0%
1      2541   7.8%
2      22696  69.7%
3      960    2.9%
4      2093   6.4%

Maximum 5 values

Value  Count  Frequency (%)
8      7      < 0.1%
7      14     < 0.1%
6      1116   3.4%
5      1836   5.6%
4      2093   6.4%
 

fnlwgt
Real number (ℝ≥0)

Distinct: 21648
Distinct (%): 66.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 189778.3665
Minimum: 12285
Maximum: 1484705
Zeros: 0
Zeros (%): 0.0%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 12285
5-th percentile: 39460
Q1: 117827
Median: 178356
Q3: 237051
95-th percentile: 379682
Maximum: 1484705
Range: 1472420
Interquartile range (IQR): 119224

Descriptive statistics

Standard deviation: 105549.9777
Coefficient of variation (CV): 0.5561749721
Kurtosis: 6.218810978
Mean: 189778.3665
Median Absolute Deviation (MAD): 59894
Skewness: 1.446980095
Sum: 6179373392
Variance: 1.114079779e+10
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value                 Count  Frequency (%)
203488                13     < 0.1%
123011                13     < 0.1%
164190                13     < 0.1%
148995                12     < 0.1%
113364                12     < 0.1%
121124                12     < 0.1%
126675                12     < 0.1%
126569                11     < 0.1%
123983                11     < 0.1%
155659                11     < 0.1%
Other values (21638)  32441  99.6%

Minimum 5 values

Value  Count  Frequency (%)
12285  1      < 0.1%
13769  1      < 0.1%
14878  1      < 0.1%
18827  1      < 0.1%
19214  1      < 0.1%

Maximum 5 values

Value    Count  Frequency (%)
1484705  1      < 0.1%
1455435  1      < 0.1%
1366120  1      < 0.1%
1268339  1      < 0.1%
1226583  1      < 0.1%
 

education
Real number (ℝ≥0)

ZEROS

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.424464851
Minimum: 0
Maximum: 15
Zeros: 5355
Zeros (%): 16.4%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 5
95-th percentile: 11
Maximum: 15
Range: 15
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.453581587
Coefficient of variation (CV): 1.008502565
Kurtosis: 1.134389094
Mean: 3.424464851
Median Absolute Deviation (MAD): 2
Skewness: 1.231511257
Sum: 111504
Variance: 11.92722578
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=16)]
Common values

Value             Count  Frequency (%)
1                 10501  32.3%
5                 7291   22.4%
0                 5355   16.4%
3                 1723   5.3%
7                 1382   4.2%
2                 1175   3.6%
6                 1067   3.3%
12                933    2.9%
8                 646    2.0%
10                576    1.8%
Other values (6)  1912   5.9%

Minimum 5 values

Value  Count  Frequency (%)
0      5355   16.4%
1      10501  32.3%
2      1175   3.6%
3      1723   5.3%
4      514    1.6%

Maximum 5 values

Value  Count  Frequency (%)
15     433    1.3%
14     51     0.2%
13     168    0.5%
12     933    2.9%
11     333    1.0%
 

education-num
Real number (ℝ≥0)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.08067934
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 9
Median: 10
Q3: 12
95-th percentile: 14
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.572720332
Coefficient of variation (CV): 0.2552129916
Kurtosis: 0.6234440748
Mean: 10.08067934
Median Absolute Deviation (MAD): 1
Skewness: -0.3116758679
Sum: 328237
Variance: 6.618889907
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=16)]
Common values

Value             Count  Frequency (%)
9                 10501  32.3%
10                7291   22.4%
13                5355   16.4%
14                1723   5.3%
11                1382   4.2%
7                 1175   3.6%
12                1067   3.3%
6                 933    2.9%
4                 646    2.0%
15                576    1.8%
Other values (6)  1912   5.9%

Minimum 5 values

Value  Count  Frequency (%)
1      51     0.2%
2      168    0.5%
3      333    1.0%
4      646    2.0%
5      514    1.6%

Maximum 5 values

Value  Count  Frequency (%)
16     413    1.3%
15     576    1.8%
14     1723   5.3%
13     5355   16.4%
12     1067   3.3%
 

marital-status
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.083781211
Minimum: 0
Maximum: 6
Zeros: 10683
Zeros (%): 32.8%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 4
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.251380624
Coefficient of variation (CV): 1.154643217
Kurtosis: 5.556855431
Mean: 1.083781211
Median Absolute Deviation (MAD): 1
Skewness: 2.155796193
Sum: 35289
Variance: 1.565953466
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=7)]
Common values

Value  Count  Frequency (%)
1      14976  46.0%
0      10683  32.8%
2      4443   13.6%
4      1025   3.1%
6      993    3.0%
3      418    1.3%
5      23     0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      10683  32.8%
1      14976  46.0%
2      4443   13.6%
3      418    1.3%
4      1025   3.1%

Maximum 5 values

Value  Count  Frequency (%)
6      993    3.0%
5      23     0.1%
4      1025   3.1%
3      418    1.3%
2      4443   13.6%
 

occupation
Real number (ℝ≥0)

ZEROS

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.666410737
Minimum: 0
Maximum: 14
Zeros: 3770
Zeros (%): 11.6%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 4
Q3: 7
95-th percentile: 11
Maximum: 14
Range: 14
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.3861192
Coefficient of variation (CV): 0.7256367669
Kurtosis: -0.5979030147
Mean: 4.666410737
Median Absolute Deviation (MAD): 2
Skewness: 0.4681046388
Sum: 151943
Variance: 11.46580324
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=15)]
Common values

Value             Count  Frequency (%)
3                 4140   12.7%
6                 4099   12.6%
1                 4066   12.5%
0                 3770   11.6%
5                 3650   11.2%
4                 3295   10.1%
9                 2002   6.1%
11                1843   5.7%
7                 1597   4.9%
2                 1370   4.2%
Other values (5)  2729   8.4%

Minimum 5 values

Value  Count  Frequency (%)
0      3770   11.6%
1      4066   12.5%
2      1370   4.2%
3      4140   12.7%
4      3295   10.1%

Maximum 5 values

Value  Count  Frequency (%)
14     149    0.5%
13     9      < 0.1%
12     649    2.0%
11     1843   5.7%
10     928    2.9%
 

relationship
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.542397347
Minimum: 0
Maximum: 5
Zeros: 8305
Zeros (%): 25.5%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 3
95-th percentile: 4
Maximum: 5
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.437430913
Coefficient of variation (CV): 0.9319459196
Kurtosis: -0.5754323471
Mean: 1.542397347
Median Absolute Deviation (MAD): 1
Skewness: 0.7752642815
Sum: 50222
Variance: 2.066207631
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=6)]
Common values

Value  Count  Frequency (%)
1      13193  40.5%
0      8305   25.5%
3      5068   15.6%
4      3446   10.6%
2      1568   4.8%
5      981    3.0%

Minimum 5 values

Value  Count  Frequency (%)
0      8305   25.5%
1      13193  40.5%
2      1568   4.8%
3      5068   15.6%
4      3446   10.6%

Maximum 5 values

Value  Count  Frequency (%)
5      981    3.0%
4      3446   10.6%
3      5068   15.6%
2      1568   4.8%
1      13193  40.5%
 

race
Real number (ℝ≥0)

ZEROS

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.22170695
Minimum: 0
Maximum: 4
Zeros: 27816
Zeros (%): 85.4%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 4
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6273481082
Coefficient of variation (CV): 2.829627615
Kurtosis: 13.92825622
Mean: 0.22170695
Median Absolute Deviation (MAD): 0
Skewness: 3.520360182
Sum: 7219
Variance: 0.3935656489
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=5)]
Common values

Value  Count  Frequency (%)
0      27816  85.4%
1      3124   9.6%
2      1039   3.2%
3      311    1.0%
4      271    0.8%

Minimum 5 values

Value  Count  Frequency (%)
0      27816  85.4%
1      3124   9.6%
2      1039   3.2%
3      311    1.0%
4      271    0.8%

Maximum 5 values

Value  Count  Frequency (%)
4      271    0.8%
3      311    1.0%
2      1039   3.2%
1      3124   9.6%
0      27816  85.4%
 

sex
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 254.4 KiB

Value  Count  Frequency (%)
0      21790  66.9%
1      10771  33.1%

capital-gain
Real number (ℝ≥0)

ZEROS

Distinct: 119
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1077.648844
Minimum: 0
Maximum: 99999
Zeros: 29849
Zeros (%): 91.7%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5013
Maximum: 99999
Range: 99999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 7385.292085
Coefficient of variation (CV): 6.853152702
Kurtosis: 154.7994379
Mean: 1077.648844
Median Absolute Deviation (MAD): 0
Skewness: 11.95384769
Sum: 35089324
Variance: 54542539.18
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value               Count  Frequency (%)
0                   29849  91.7%
15024               347    1.1%
7688                284    0.9%
7298                246    0.8%
99999               159    0.5%
3103                97     0.3%
5178                97     0.3%
4386                70     0.2%
5013                69     0.2%
8614                55     0.2%
Other values (109)  1288   4.0%

Minimum 5 values

Value  Count  Frequency (%)
0      29849  91.7%
114    6      < 0.1%
401    2      < 0.1%
594    34     0.1%
914    8      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
99999  159    0.5%
41310  2      < 0.1%
34095  5      < 0.1%
27828  34     0.1%
25236  11     < 0.1%
 

capital-loss
Real number (ℝ≥0)

ZEROS

Distinct: 92
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 87.30382973
Minimum: 0
Maximum: 4356
Zeros: 31042
Zeros (%): 95.3%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 4356
Range: 4356
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 402.9602186
Coefficient of variation (CV): 4.615607584
Kurtosis: 20.37680171
Mean: 87.30382973
Median Absolute Deviation (MAD): 0
Skewness: 4.594629122
Sum: 2842700
Variance: 162376.9378
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value              Count  Frequency (%)
0                  31042  95.3%
1902               202    0.6%
1977               168    0.5%
1887               159    0.5%
1485               51     0.2%
1848               51     0.2%
2415               49     0.2%
1602               47     0.1%
1740               42     0.1%
1590               40     0.1%
Other values (82)  710    2.2%

Minimum 5 values

Value  Count  Frequency (%)
0      31042  95.3%
155    1      < 0.1%
213    4      < 0.1%
323    3      < 0.1%
419    3      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
4356   3      < 0.1%
3900   2      < 0.1%
3770   2      < 0.1%
3683   2      < 0.1%
3004   2      < 0.1%
 

hours-per-week
Real number (ℝ≥0)

Distinct: 94
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.43745585
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 18
Q1: 40
Median: 40
Q3: 45
95-th percentile: 60
Maximum: 99
Range: 98
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 12.34742868
Coefficient of variation (CV): 0.3053463286
Kurtosis: 2.916686796
Mean: 40.43745585
Median Absolute Deviation (MAD): 3
Skewness: 0.2276425368
Sum: 1316684
Variance: 152.4589951
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value              Count  Frequency (%)
40                 15217  46.7%
50                 2819   8.7%
45                 1824   5.6%
60                 1475   4.5%
35                 1297   4.0%
20                 1224   3.8%
30                 1149   3.5%
55                 694    2.1%
25                 674    2.1%
48                 517    1.6%
Other values (84)  5671   17.4%

Minimum 5 values

Value  Count  Frequency (%)
1      20     0.1%
2      32     0.1%
3      39     0.1%
4      54     0.2%
5      60     0.2%

Maximum 5 values

Value  Count  Frequency (%)
99     85     0.3%
98     11     < 0.1%
97     2      < 0.1%
96     5      < 0.1%
95     2      < 0.1%
 

native-country
Real number (ℝ≥0)

ZEROS

Distinct: 42
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.290316636
Minimum: 0
Maximum: 41
Zeros: 29170
Zeros (%): 89.6%
Memory size: 254.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 9
Maximum: 41
Range: 41
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 5.045373476
Coefficient of variation (CV): 3.910182457
Kurtosis: 25.65586985
Mean: 1.290316636
Median Absolute Deviation (MAD): 0
Skewness: 4.912896745
Sum: 42014
Variance: 25.45579351
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=42)]
Common values

Value              Count  Frequency (%)
0                  29170  89.6%
5                  643    2.0%
4                  583    1.8%
13                 198    0.6%
11                 137    0.4%
10                 121    0.4%
7                  114    0.4%
25                 106    0.3%
3                  100    0.3%
1                  95     0.3%
Other values (32)  1294   4.0%

Minimum 5 values

Value  Count  Frequency (%)
0      29170  89.6%
1      95     0.3%
2      81     0.2%
3      100    0.3%
4      583    1.8%

Maximum 5 values

Value  Count  Frequency (%)
41     1      < 0.1%
40     13     < 0.1%
39     24     0.1%
38     20     0.1%
37     67     0.2%
 

income
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 254.4 KiB

Value  Count  Frequency (%)
0      24720  75.9%
1      7841   24.1%

Interactions

[Figures: 169 pairwise interaction plots, one for each ordered pair of the 13 numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
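
As a minimal sketch of that definition, the function below divides the covariance by the product of the standard deviations. It uses only the standard library; the function name and the toy data are illustrative and are not taken from the profiled dataset.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data (hypothetical).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Shifting and positively rescaling x leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
# An exactly linear relationship gives r = 1 regardless of slope.
r_linear = pearson_r(x, [2 * a + 1 for a in x])

print(r, r_rescaled, r_linear)
```

Population and sample normalisations (dividing by n or by n - 1) give the same r, because the factor cancels between numerator and denominator.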

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
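
A minimal sketch of that procedure: replace each value by its rank, then apply the covariance-over-standard-deviations formula to the ranks. Ties are given the average of the positions they occupy, which is a common convention but an assumption here; all names and the toy data are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship (hypothetical data): y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
tied_ranks = average_ranks([10, 20, 20, 30])

print(rho, r, tied_ranks)
```

On this toy data ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.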

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
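
The pair-counting definition can be sketched directly. This is the plain form (often called tau-a), in which tied pairs count as neither concordant nor discordant; tie-adjusted variants exist, and whether the report uses one of them is not stated here. The function name and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Toy data (hypothetical): one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)
print(tau)
```

This version compares every pair and so takes time quadratic in the number of observations; it is meant to mirror the definition, not to be fast.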

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

[Figures: two missing-value plots]

Sample

First rows

index  age  workclass  fnlwgt  education  education-num  marital-status  occupation  relationship  race  sex  capital-gain  capital-loss  hours-per-week  native-country  income
0      39   0          77516   0          13             0               0           0             0     0    2174          0             40              0               0
1      50   1          83311   0          13             1               1           1             0     0    0             0             13              0               0
2      38   2          215646  1          9              2               2           0             0     0    0             0             40              0               0
3      53   2          234721  2          7              1               2           1             1     0    0             0             40              0               0
4      28   2          338409  0          13             1               3           2             1     1    0             0             40              1               0
5      37   2          284582  3          14             1               1           2             0     1    0             0             40              0               0
6      49   2          160187  4          5              3               4           0             1     1    0             0             16              2               0
7      52   1          209642  1          9              1               1           1             0     0    0             0             45              0               1
8      31   2          45781   3          14             0               3           0             0     1    14084         0             50              0               1
9      42   2          159449  0          13             1               1           1             0     0    5178          0             40              0               1

Last rows

index  age  workclass  fnlwgt  education  education-num  marital-status  occupation  relationship  race  sex  capital-gain  capital-loss  hours-per-week  native-country  income
325513223406612612130004000
325524328466171115100004500
325533221161383140100200011210
3255453232186531411100004001
32555222310152510012000004000
32556272257302612110201003800
325574021543741919100004001
325585821519101960401004000
325592222014901900300002000
3256052628792719112011502404001

Duplicate rows

Most frequent

index  age  workclass  fnlwgt  education  education-num  marital-status  occupation  relationship  race  sex  capital-gain  capital-loss  hours-per-week  native-country  income  count
825219599413201400100402703
01929726119080000040002
1192138153510003010010002
2192146679510013100030002
3192251579510043000014002
42021076585100100010010002
5212243368141080000050502
6212250051510033010010002
7232240137113020000055502
9252308144013060000040502
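
A table like this one can be derived by counting identical rows. The sketch below does so with the standard library on a handful of made-up, shortened rows. It also counts duplicates as the number of extra copies beyond the first occurrence of each row; that this is the definition behind the "Duplicate rows" figure in the overview is an assumption.

```python
from collections import Counter

# Hypothetical, shortened rows for illustration only.
rows = [
    (19, 2, 97261),
    (25, 2, 195994),
    (19, 2, 97261),
    (25, 2, 195994),
    (25, 2, 195994),
    (39, 0, 77516),
]

counts = Counter(rows)

# Rows that occur more than once, most frequent first.
most_frequent = sorted(
    ((row, count) for row, count in counts.items() if count > 1),
    key=lambda item: item[1],
    reverse=True,
)

# Extra copies beyond the first occurrence of each distinct row.
n_duplicate_rows = sum(count - 1 for count in counts.values())

print(most_frequent)
print(n_duplicate_rows)
```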